Minimum Message Length based Mixture Modelling using Bivariate von Mises Distributions with Applications to Bioinformatics

Author

  • Parthan Kasarapu
Abstract

The modelling of empirically observed data is commonly done using mixtures of probability distributions. To model angular data, directional probability distributions such as the bivariate von Mises (BVM) are typically used. The critical task in mixture modelling is to determine the optimal number of component probability distributions. We employ the Bayesian information-theoretic principle of minimum message length (MML) to distinguish between mixture models by balancing the trade-off between a model's complexity and its goodness-of-fit to the data. We consider the problem of modelling angular data resulting from the spatial arrangement of protein structures using BVM distributions. The main contributions of the paper include the development of the mixture modelling apparatus along with the MML estimation of the parameters of the BVM distribution. We demonstrate that statistical inference using the MML framework supersedes the traditional methods and offers a mechanism to objectively determine models that are of practical significance.
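As a rough illustration of what a BVM mixture density looks like, the sketch below assumes the sine variant of the BVM distribution, whose density is proportional to exp(k1 cos(t1 - mu1) + k2 cos(t2 - mu2) + lam sin(t1 - mu1) sin(t2 - mu2)). The abstract does not state which BVM variant is used, so this choice is an assumption, and the function names are hypothetical. The normalising constant is approximated numerically on a grid rather than through a series expansion, and the code does not implement the paper's MML estimators or its search over the number of components.

```python
import math


def bvm_sine_log_kernel(t1, t2, mu1, mu2, k1, k2, lam):
    """Unnormalised log-density of the (assumed) BVM sine model; angles in radians."""
    d1, d2 = t1 - mu1, t2 - mu2
    return k1 * math.cos(d1) + k2 * math.cos(d2) + lam * math.sin(d1) * math.sin(d2)


def bvm_sine_normaliser(k1, k2, lam, n=120):
    """Approximate the integral of exp(kernel) over the torus [0, 2*pi)^2.

    Uses an n-by-n rectangle rule. The means are set to zero because the
    integral over a full period does not depend on them.
    """
    h = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += math.exp(
                bvm_sine_log_kernel(i * h, j * h, 0.0, 0.0, k1, k2, lam)
            )
    return total * h * h


def mixture_density(t1, t2, weights, components):
    """Density of a finite mixture at (t1, t2).

    Each component is a tuple (mu1, mu2, k1, k2, lam); weights should sum to 1.
    The normaliser is recomputed on every call, which is slow but keeps the
    sketch short.
    """
    dens = 0.0
    for w, (mu1, mu2, k1, k2, lam) in zip(weights, components):
        z = bvm_sine_normaliser(k1, k2, lam)
        kernel = bvm_sine_log_kernel(t1, t2, mu1, mu2, k1, k2, lam)
        dens += w * math.exp(kernel) / z
    return dens
```

For protein data, the two angles would typically be a pair of dihedral angles per residue. A practical implementation would cache the normalising constants and work in log space to avoid overflow at high concentrations.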


Related resources

Unsupervised Learning of Gamma Mixture Models Using Minimum Message Length

Mixture modelling or unsupervised classification is a problem of identifying and modelling components in a body of data. Earlier work in mixture modelling using Minimum Message Length (MML) includes the multinomial and Gaussian distributions (Wallace and Boulton, 1968), the von Mises circular and Poisson distributions (Wallace and Dowe, 1994, 2000) and the distribution (Agusta and Dowe, 2002a, ...


Modelling of directional data using Kent distributions

The modelling of data on a spherical surface requires the consideration of directional probability distributions. To model asymmetrically distributed data on a three-dimensional sphere, Kent distributions are often used. The moment estimates of the parameters are typically used in modelling tasks involving Kent distributions. However, these lack a rigorous statistical treatment. The focus of th...


MML mixture modelling of multi-state, Poisson, von Mises circular and Gaussian distributions

Minimum Message Length (MML) is an invariant Bayesian point estimation technique which is also consistent and efficient. We provide a brief overview of MML inductive inference (Wallace and Boulton (1968), Wallace and Freeman (1987)), and how it has both an information-theoretic and a Bayesian interpretation. We then outline how MML is used for statistical parameter estimation, and how the MML mix...


Efficiency of the pseudolikelihood for multivariate normal and von Mises distributions

In certain circumstances inference based on the likelihood function can be hindered by, for example, computational complexity; new applications of directional statistics to bioinformatics problems give many obvious examples. In such cases it is necessary to seek an alternative method of estimation. Two pseudolikelihoods, each based on conditional distributions, are assessed in terms of their ef...


Some Fundamental Properties of a Multivariate von Mises Distribution

In application areas like bioinformatics multivariate distributions on angles are encountered which show significant clustering. One approach to statistical modelling of such situations is to use mixtures of unimodal distributions. In the literature (Mardia et al., 2011), the multivariate von Mises distribution, also known as the multivariate sine distribution, has been suggested for components...




Publication date: 2016